The Vesuvius Challenge is using AI to virtually unroll Pompeii's ancient scrolls

AIHub

A closed carbonised papyrus scroll from Herculaneum being scanned. The Vesuvius Challenge is an unparalleled competition in the field of classical studies, with the potential to pave the way for something akin to a second Renaissance. Its objective is to use artificial intelligence (AI) to virtually unroll hundreds of closed papyrus scrolls containing ancient literature that has not been seen for 2,000 years. When Mount Vesuvius erupted in AD 79, it buried several cities on the Gulf of Naples under masses of volcanic material, including Herculaneum, located near Pompeii. In the 18th century, an exceptionally luxurious Roman villa was excavated there, close to the ancient city walls and shoreline. The villa's marvellous wall paintings, mosaics, busts and statues had been preserved by the ash.


#AAAI2024 workshops round-up 4: eXplainable AI approaches for deep reinforcement learning, and responsible language models

AIHub

Deep reinforcement learning (DRL) has recently made remarkable progress in several application domains, such as games, finance, autonomous driving, and recommendation systems. However, the black-box nature of deep neural networks and the complex interaction among various factors raise challenges in understanding and interpreting the models' decision-making processes. This workshop brought together researchers, practitioners, and experts from both the DRL and the explainable AI communities to focus on methods, techniques, and frameworks that enhance the explainability and interpretability of DRL algorithms. The responsible language models (ReLM) workshop focused on the development, implementation, and applications of LMs aligned with responsible AI principles. Both theoretical and practical challenges regarding the design and deployment of responsible LMs were discussed, including bias identification and quantification, bias mitigation, transparency, privacy and security issues, hallucination, uncertainty quantification, and various other risks associated with LMs.


Diffusing Colors: Image Colorization with Text Guided Diffusion

Zabari, Nir, Azulay, Aharon, Gorkor, Alexey, Halperin, Tavi, Fried, Ohad

arXiv.org Artificial Intelligence

The colorization of grayscale images is a complex and subjective task with significant challenges. Despite recent progress in employing large-scale datasets with deep neural networks, difficulties with controllability and visual quality persist. To tackle these issues, we present a novel image colorization framework that utilizes image diffusion techniques with granular text prompts. This integration not only produces colorization outputs that are semantically appropriate but also greatly improves the level of control users have over the colorization process. Our method provides a balance between automation and control, outperforming existing techniques in terms of visual quality and semantic coherence. We leverage a pretrained generative Diffusion Model, and show that we can finetune it for the colorization task without losing its generative power or attention to text prompts. Moreover, we present a novel CLIP-based ranking model that evaluates color vividness, enabling automatic selection of the most suitable level of vividness based on the specific scene semantics. Our approach holds potential particularly for color enhancement and historical image colorization.


Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data

Naggita, Keziah, LaChance, Julienne, Xiang, Alice

arXiv.org Artificial Intelligence

Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.


Robot fish makes splash with motion breakthrough

Robohub

The robot fish was fitted with a twisted and coiled polymer (TCP) to drive it forward. A TCP is a lightweight, low-cost device that relies on temperature change to generate movement, which also limits its speed. It works by contracting like a muscle when heated, converting the heat energy into mechanical motion. The TCP used in this work is warmed by Joule heating: the passage of current through an electrical conductor produces thermal energy and heats up the conductor. Minimising the distance between the TCP on one side of the robot fish and the spring on the other activates the fin at the rear, enabling the robot fish to reach new speeds. The undulating flapping of its rear fin was measured at a frequency of 2 Hz, or two waves per second.
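As a rough illustration of the two quantities mentioned above, the sketch below computes Joule heating power from the standard relation P = I²R and the period of a 2 Hz fin beat. The current and resistance values are hypothetical placeholders, not figures from the article, and the energy estimate assumes the actuator is heated continuously, which may not match the real drive scheme.

```python
def joule_heating_power(current_a: float, resistance_ohm: float) -> float:
    """Power dissipated as heat in a resistive conductor: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


def period_from_frequency(frequency_hz: float) -> float:
    """Duration of one cycle in seconds for a given frequency in hertz."""
    return 1.0 / frequency_hz


# Hypothetical drive values, chosen only for illustration.
current_a = 0.5        # amperes (assumed)
resistance_ohm = 8.0   # ohms (assumed)

# Fin-beat frequency reported in the article.
fin_frequency_hz = 2.0

power_w = joule_heating_power(current_a, resistance_ohm)
period_s = period_from_frequency(fin_frequency_hz)

# Heat delivered during one fin beat, assuming continuous heating.
energy_per_beat_j = power_w * period_s

print(f"Heating power: {power_w} W")
print(f"Fin-beat period: {period_s} s")
print(f"Heat per beat (continuous heating assumed): {energy_per_beat_j} J")
```

Because the actuator must heat up and cool down within each beat, the thermal time constant of the polymer, rather than the electrical power alone, is what would cap the achievable frequency.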


User spending goes up by more than 4000% on AI-powered apps

#artificialintelligence

Given the rising interest in generative AI tools like text-based ChatGPT and image-based Midjourney, AI-powered apps are growing in numbers and popularity in both app stores. A report by analytics firm Apptopia suggests that 158 AI Chatbot apps -- with the description having keywords like "AI Chat" or "AI Chatbot" -- hit the app stores in the first quarter of this year. The data suggests that multiple apps like Nova AI, Genie AI and Chat with Ask AI have broken into top charts in app stores -- a lot of these apps are similarly named, so it's easy to get confused between them. At the time of writing, Chat with Ask AI is on the top 10 free apps list on iOS in multiple countries. Apptopia mentions in the report that developers are trying to convert AI chatbot tech, which is easily available on a web browser, into a native mobile experience and charging money for it.


Wonder Dynamics puts a full-service CG character studio in a web platform

#artificialintelligence

The tools of modern cinema have become increasingly accessible to independent and even amateur filmmakers, but realistic CG characters (like them or not) have remained the province of big-budget projects. Wonder Dynamics aims to change that with a platform that lets creators literally drag and drop a CG character into any scene as if it was professionally captured and edited. Yes, it sounds a bit like overpromising. Your skepticism is warranted, but as a skeptic myself I have to say I was extremely impressed with what the startup showed of Wonder Studio, the company's web-based editor. This isn't a toy like an AR filter -- it's a full-scale tool, and one that co-founders Nikola Todorovic and Tye Sheridan have longed for themselves.


The takeaways from Stanford's 386-page report on the state of AI

#artificialintelligence

Writing a report on the state of AI must feel a lot like building on shifting sands: By the time you hit publish, the whole industry has changed under your feet. But there are still important trends and takeaways in Stanford's 386-page bid to summarize this complex and fast-moving domain. The AI Index, from the Institute for Human-Centered Artificial Intelligence, worked with experts from academia and private industry to collect information and predictions on the matter. As a yearly effort (and by the size of it, you can bet they're already hard at work laying out the next one), this may not be the freshest take on AI, but these periodic broad surveys are important to keep one's finger on the pulse of industry. This year's report includes "new analysis on foundation models, including their geopolitics and training costs, the environmental impact of AI systems, K-12 AI education, and public opinion trends in AI," plus a look at policy in a hundred new countries.


The week in AI: The pause request heard 'round the world

#artificialintelligence

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here's a handy roundup of the last week's stories in the world of machine learning, along with notable research and experiments we didn't cover on their own. In one of the more surprising stories of the past week, Italy's data protection authority (DPA) blocked OpenAI's viral AI-powered chatbot, ChatGPT, citing concerns that the tool breaches the European Union's General Data Protection Regulation. The DPA is reportedly opening an investigation into whether OpenAI unlawfully processed people's data, as well as over the lack of any system to prevent minors from accessing the tech. It's unclear what the outcome might be; OpenAI has 20 days to respond to the order.